Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} #13066
miguelmartin75 wants to merge 35 commits into huggingface:main
Conversation
yiyixuxu
left a comment
Thanks for the PR! The overall structure looks good. I left some minor comments.
One question before I can review further: Are the base transformer weights the same across the different control variants?
This helps us understand whether splitting the controlnet from the transformer makes sense (i.e., whether users can mix and match), and also whether the controlnet is required for this pipeline at all.
--save_pipeline

# seg
transformer_ckpt_path=~/.cache/huggingface/hub/models--nvidia--Cosmos-Transfer2.5-2B/snapshots/eb5325b77d358944da58a690157dd2b8071bbf85/general/seg/5136ef49-6d8d-42e8-8abf-7dac722a304a_ema_bf16.pt
ohh does each variant come with its own base transformer?
In diffusers we typically split the controlnet from the base model so that users can mix and match. Is this possible with Cosmos?
Each variant should have the same weights as the base transformer (I will double-check this), but I split out the controlnet and save the pipeline (base transformer + controlnet) so that the pipeline can be loaded directly from a model_id/revision.
I will look into only loading the controlnet from the converted script.
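For context, a minimal loading sketch assuming the converted pipeline has been pushed to the Hub; the repo id and revision below are placeholders, not a published checkpoint:

```python
import torch
from diffusers import DiffusionPipeline

# Hypothetical repo id/revision: once the conversion script saves the full
# pipeline (base transformer + controlnet), it can be loaded in one call.
pipe = DiffusionPipeline.from_pretrained(
    "nvidia/Cosmos-Transfer2.5-2B-diffusers",  # placeholder repo id
    revision="main",
    torch_dtype=torch.bfloat16,
)
```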
raise AttributeError("Could not access latents of provided encoder_output")


def transfer2_5_forward(
Can we inline this inside the __call__ method? We typically only create separate methods for operations users might need to run standalone to pre-compute things (like encode_prompt, encode_video, etc.). It's also easier to read when you don't have to jump around the file.
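A rough sketch of the convention described here (class and method names are illustrative only): standalone pre-compute steps stay as public methods, while one-off forward logic is inlined in `__call__`.

```python
class ExamplePipeline:
    def encode_prompt(self, prompt):
        # kept as a separate method so users can pre-compute embeddings standalone
        ...

    def __call__(self, prompt, control_video=None):
        prompt_embeds = self.encode_prompt(prompt)
        # the denoising/forward logic is inlined here instead of living in a
        # separate transfer2_5_forward helper, so the flow reads top to bottom
        ...
```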
transformer: CosmosTransformer3DModel,
vae: AutoencoderKLWan,
scheduler: UniPCMultistepScheduler,
controlnet: CosmosControlNetModel,
is controlnet optional here?
Yes. I will change the type hint.
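Presumably the signature would end up looking roughly like this (a sketch only; the pipeline class name is assumed, and `CosmosControlNetModel` is the class introduced in this PR):

```python
from typing import Optional

from diffusers import (
    AutoencoderKLWan,
    CosmosTransformer3DModel,
    DiffusionPipeline,
    UniPCMultistepScheduler,
)


class CosmosTransferPipeline(DiffusionPipeline):  # assumed class name
    def __init__(
        self,
        transformer: CosmosTransformer3DModel,
        vae: AutoencoderKLWan,
        scheduler: UniPCMultistepScheduler,
        controlnet: Optional["CosmosControlNetModel"] = None,  # optional; class added in this PR
    ):
        super().__init__()
        self.register_modules(
            transformer=transformer,
            vae=vae,
            scheduler=scheduler,
            controlnet=controlnet,
        )
```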
What does this PR do?
This PR introduces the Cosmos Transfer2.5 inference pipeline, which extends the existing code in transformer_cosmos.py and introduces a new controlnet class for Cosmos. The conversion script is also updated to convert the corresponding checkpoints.
I've intentionally split the controlnet from the base predict model to match the rest of the diffusers codebase. To do this, I had to duplicate some layers/weights from the base model (relating to the patch and timestep embeddings); I believe SD3 does the same.
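For illustration, a minimal sketch of what duplicating those embedding layers looks like; the module structure, dimensions, and names here are assumptions, not the actual implementation in this PR:

```python
import torch.nn as nn


class ControlNetSketch(nn.Module):
    """Toy controlnet that owns its own copies of the base model's patch and
    timestep embeddings, so its checkpoint is self-contained and it never
    reaches into the base transformer's weights."""

    def __init__(self, in_features=64, embed_dim=1024, num_blocks=2):
        super().__init__()
        # duplicated from the base transformer: patch embedding
        self.patch_embed = nn.Linear(in_features, embed_dim)
        # duplicated from the base transformer: timestep embedding
        self.time_embed = nn.Sequential(
            nn.Linear(256, embed_dim), nn.SiLU(), nn.Linear(embed_dim, embed_dim)
        )
        # controlnet-specific blocks that produce residuals for the base model
        self.blocks = nn.ModuleList(
            nn.Linear(embed_dim, embed_dim) for _ in range(num_blocks)
        )

    def forward(self, control_patches, timestep_emb):
        hidden = self.patch_embed(control_patches) + self.time_embed(timestep_emb)
        residuals = []
        for block in self.blocks:
            hidden = block(hidden)
            residuals.append(hidden)
        # the base transformer adds these residuals at its matching blocks
        return residuals
```

Because the controlnet carries its own embeddings, its checkpoint can be loaded and combined with the base transformer without modifying the base weights, similar to how SD3's controlnet is laid out.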
Similar to predict2.5, I have added documentation and unit tests.
Additional PRs will be submitted for the following features (in order of priority):
In addition, the guardrails safety model is unfortunately too aggressive: it currently flags the examples we have for cosmos-transfer2.5 as "not safe" (e.g. the edge example with 93 frames is flagged). The guardrail model needs to be updated, but that work is roughly orthogonal to this PR.
Who can review?
Core library: